Projected Sequential Gaussian Processes: A C++ tool for interpolation of heterogeneous data sets

Authors

  • Remi Barillec
  • Ben Ingram
  • Dan Cornford
  • Lehel Csató
Abstract

Within MUCM there may occasionally be a need to use large training sets, or to employ observations with non-Gaussian noise characteristics or non-linear sensor models in a calibration stage. This technical report deals with Gaussian process models in these non-Gaussian and/or large data set cases. Treating such data within Gaussian processes is most naturally accomplished using a Bayesian approach; however, such methods generally scale rather badly with the size of the data set and require computationally expensive Monte Carlo based inference in non-Gaussian settings. Recently, within the machine learning and spatial statistics communities, many papers have explored the potential of reduced rank representations of the covariance matrix, often referred to as projected or fixed rank approaches. In such methods the covariance function of the posterior process is represented by a reduced rank approximation, chosen such that information loss is minimal. In this paper a sequential Bayesian framework for inference in such projected processes is presented. The observations are considered one at a time, which avoids the need for the high dimensional integrals typically required in a Bayesian approach. A C++ library, psgp, which is part of the INTAMAP web service, is introduced; it implements projected, sequential estimation and adds several novel features. In particular, the library includes the ability to use a generic observation operator, or sensor model, to permit data fusion. It is also possible to cope with a range of observation error characteristics, including non-Gaussian observation errors. Inference for the covariance parameters is explored, including the impact of the projected process approximation on likelihood profiles. We illustrate the projected sequential method in application to synthetic and real data sets. Limitations and extensions are discussed.
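The one-observation-at-a-time idea in the abstract can be illustrated with a minimal sketch. It is not the psgp API and it omits the projection (reduced rank) step entirely. It assumes Gaussian observation noise, a linear observation operator given as a vector `h`, and a latent process restricted to a small fixed set of locations; the function names `sq_exp_cov` and `sequential_update` are hypothetical. Under these assumptions each observation requires only a rank-one update of the posterior mean and covariance.

```python
import math


def sq_exp_cov(xs, variance=1.0, length_scale=1.0):
    """Squared-exponential prior covariance over locations xs (assumed kernel)."""
    return [[variance * math.exp(-0.5 * ((a - b) / length_scale) ** 2) for b in xs]
            for a in xs]


def sequential_update(mean, cov, h, y, noise_var):
    """Absorb one observation y = h.f + Gaussian noise into N(mean, cov).

    Hypothetical helper for illustration only; no projection step is applied.
    """
    n = len(mean)
    v = [sum(cov[i][j] * h[j] for j in range(n)) for i in range(n)]  # cov @ h
    s = sum(h[i] * v[i] for i in range(n)) + noise_var  # predictive variance of y
    resid = y - sum(h[i] * mean[i] for i in range(n))   # innovation
    gain = [vi / s for vi in v]
    new_mean = [mean[i] + gain[i] * resid for i in range(n)]
    # Rank-one downdate: cov - (cov h)(cov h)^T / s
    new_cov = [[cov[i][j] - gain[i] * v[j] for j in range(n)] for i in range(n)]
    return new_mean, new_cov


xs = [0.0, 1.0, 2.0]                    # latent locations (illustrative values)
mean = [0.0] * len(xs)
cov = sq_exp_cov(xs)
observations = [(0, 0.5), (2, -0.3)]    # (index of observed location, value)

for idx, y in observations:
    h = [1.0 if i == idx else 0.0 for i in range(len(xs))]  # direct observation
    mean, cov = sequential_update(mean, cov, h, y, noise_var=0.1)
```

With Gaussian noise and a linear `h`, processing the data in this way gives the same posterior as a single batch computation. For non-Gaussian likelihoods the update coefficients would have to be obtained differently, and that case is not covered by this sketch.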

Related resources

Projected sequential Gaussian processes: A C++ tool for interpolation of large datasets with heterogeneous noise

Heterogeneous data sets arise naturally in most applications due to the use of a variety of sensors and measuring platforms. Such data sets can be heterogeneous in terms of their error characteristics and sensor models. Treating such data is most naturally accomplished using a Bayesian or model-based geostatistical approach; however, such methods generally scale rather badly with the size of dat...

Spatial Interpolation Using Copula for non-Gaussian Modeling of Rainfall Data

One of the most useful tools for handling multivariate distributions of dependent variables in terms of their marginal distributions is the copula function. Copula families have attracted considerable attention due to their applicability and flexibility in describing non-Gaussian spatially dependent data. The particular properties of the spatial copula are rarely ...

Presentation of K Nearest Neighbor Gaussian Interpolation and comparing it with Fuzzy Interpolation in Speech Recognition

The hidden Markov model (HMM) is a popular statistical method used in continuous and discrete speech recognition. The probability density function of the observation vectors in each state is estimated with discrete density or continuous density modeling. The performance (in correct word recognition rate) of the continuous density HMM is higher than that of the discrete density HMM, but its computational complexity is very ...

Chronic heterogeneous sequential stress increases formalin-induced nociceptive

Introduction: Chronic heterogeneous stress may be better suited to evaluating the effect of chronic stress situations on nociceptive behaviour. The present study investigated the effects of chronic heterogeneous sequential stress on thermal-induced nociception and formalin-induced pain behaviour in rats. Methods: In the present study, adult rats (220-300 g) were used. Animals were ...


Publication date: 2010